Skip to content

Conversation

@tomerqodo
Copy link

@tomerqodo tomerqodo commented Dec 4, 2025

User description

Benchmark PR elastic#135875

Type: Clean (correct implementation)

Original PR Title: Allocation: introduce a new decider that balances the index shard count among nodes
Original PR Description: For a cluster with n data nodes to host an index with m shards, each node ideally should host not significantly more than m / n shards each. This new allocation decider acts on this principle to respond with a NOT_PREFERRED in event of a node being allocated more shards than threshold.

Nodes in shutdown are excluded when computing fair workload for nodes. A load skew tolerance setting is added to permit nodes to own more than ideal number of shards for the index.

Relates ES-12080
Original PR URL: elastic#135875


PR Type

Enhancement


Description

  • Introduce IndexBalanceAllocationDecider to balance index shard distribution

  • Add settings to control decider behavior with excess shard tolerance

  • Register new decider in cluster module allocation pipeline

  • Extend DiscoveryNodeFilters with hasFilters() utility method

  • Improve test infrastructure with shared assertion helper


Diagram Walkthrough

flowchart LR
  A["IndexBalanceConstraintSettings<br/>(Configuration)"] --> B["IndexBalanceAllocationDecider<br/>(New Decider)"]
  B --> C["ClusterModule<br/>(Registration)"]
  D["DiscoveryNodeFilters<br/>(hasFilters utility)"] --> B
  E["FilterAllocationDecider<br/>(Exposed constants)"] --> B
  B --> F["Allocation Pipeline<br/>(Shard Balancing)"]
Loading

File Walkthrough

Relevant files
Enhancement
6 files
IndexBalanceAllocationDecider.java
New allocation decider for index shard balancing                 
+172/-0 
ClusterModule.java
Register IndexBalanceAllocationDecider in allocation pipeline
+2/-0     
DiscoveryNodeFilters.java
Add hasFilters utility method to check filter presence     
+4/-0     
BalancedShardsAllocator.java
Expose getAllocation and make ProjectIndex record public for testing
+7/-1     
WeightFunction.java
Make calculateNodeWeightWithIndex method public for testing
+2/-1     
FilterAllocationDecider.java
Expose filter prefix constants for reuse by other deciders
+3/-3     
Configuration changes
2 files
IndexBalanceConstraintSettings.java
Settings for index balance decider configuration                 
+59/-0   
ClusterSettings.java
Register IndexBalanceConstraintSettings in cluster settings
+3/-0     
Tests
4 files
IndexBalanceAllocationDeciderTests.java
Comprehensive test suite for index balance allocation decider
+365/-0 
ClusterModuleTests.java
Update allocation decider order test to include new decider
+3/-1     
BalancedShardsAllocatorTests.java
Update test to use public calculateNodeWeightWithIndex method
+1/-1     
ESAllocationTestCase.java
Add shared assertDecisionMatches helper for allocation tests
+13/-0   
Miscellaneous
1 files
WriteLoadConstraintDeciderTests.java
Remove duplicate assertion helper moved to base test class
+0/-14   

@qodo-code-review
Copy link

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🟢
No security concerns identified No security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
No audit logs: The new allocation decision paths add or deny shard placements without emitting audit logs
containing actor, timestamp, action, and outcome.

Referred Code
@Override
public Decision canAllocate(ShardRouting shardRouting, RoutingNode node, RoutingAllocation allocation) {
    if (indexBalanceConstraintSettings.isDeciderEnabled() == false || isStateless == false || hasFilters()) {
        return allocation.decision(Decision.YES, NAME, "Decider is disabled.");
    }

    Index index = shardRouting.index();
    if (node.hasIndex(index) == false) {
        return allocation.decision(Decision.YES, NAME, "Node does not currently host this index.");
    }

    assert node.node() != null;
    assert node.node().getRoles().contains(INDEX_ROLE) || node.node().getRoles().contains(SEARCH_ROLE);

    if (node.node().getRoles().contains(INDEX_ROLE) && shardRouting.primary() == false) {
        return allocation.decision(Decision.YES, NAME, "An index node cannot own search shards. Decider inactive.");
    }

    if (node.node().getRoles().contains(SEARCH_ROLE) && shardRouting.primary()) {
        return allocation.decision(Decision.YES, NAME, "A search node cannot own primary shards. Decider inactive.");
    }


 ... (clipped 53 lines)

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Edge-case handling: The decider assumes non-empty eligible nodes and positive shard counts, returning YES when
empty, but lacks explicit guards or metrics for unusual states (e.g., all nodes
filtered/shutdown) beyond assertions.

Referred Code
final ProjectId projectId = allocation.getClusterState().metadata().projectFor(index).id();
final Set<DiscoveryNode> eligibleNodes = new HashSet<>();
int totalShards = 0;
String nomenclature = EMPTY;

if (node.node().getRoles().contains(INDEX_ROLE)) {
    collectEligibleNodes(allocation, eligibleNodes, INDEX_ROLE);
    // Primary shards only.
    totalShards = allocation.getClusterState().routingTable(projectId).index(index).size();
    nomenclature = "index";
} else if (node.node().getRoles().contains(SEARCH_ROLE)) {
    collectEligibleNodes(allocation, eligibleNodes, SEARCH_ROLE);
    // Replicas only.
    final IndexMetadata indexMetadata = allocation.getClusterState().metadata().getProject(projectId).index(index);
    totalShards = indexMetadata.getNumberOfShards() * indexMetadata.getNumberOfReplicas();
    nomenclature = "search";
}

assert eligibleNodes.isEmpty() == false;
if (eligibleNodes.isEmpty()) {
    return allocation.decision(Decision.YES, NAME, "There are no eligible nodes available.");


 ... (clipped 4 lines)

Learn more about managing compliance generic rules or creating your own custom rules

Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@qodo-code-review
Copy link

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
High-level
Decider should respect allocation filters

The IndexBalanceAllocationDecider currently deactivates when cluster-wide
allocation filters are used. It should instead incorporate these filters to
correctly determine the set of eligible nodes for balancing calculations.

Examples:

server/src/main/java/org/elasticsearch/cluster/routing/allocation/decider/IndexBalanceAllocationDecider.java [74-76]
        if (indexBalanceConstraintSettings.isDeciderEnabled() == false || isStateless == false || hasFilters()) {
            return allocation.decision(Decision.YES, NAME, "Decider is disabled.");
        }
server/src/main/java/org/elasticsearch/cluster/routing/allocation/decider/IndexBalanceAllocationDecider.java [147-153]
    private void collectEligibleNodes(RoutingAllocation allocation, Set<DiscoveryNode> eligibleNodes, DiscoveryNodeRole role) {
        for (DiscoveryNode discoveryNode : allocation.nodes()) {
            if (discoveryNode.getRoles().contains(role) && allocation.metadata().nodeShutdowns().contains(discoveryNode.getId()) == false) {
                eligibleNodes.add(discoveryNode);
            }
        }
    }

Solution Walkthrough:

Before:

// In IndexBalanceAllocationDecider.java

public Decision canAllocate(ShardRouting shardRouting, RoutingNode node, RoutingAllocation allocation) {
    if (hasFilters()) {
        // Decider deactivates itself if any cluster-wide filters are present.
        return allocation.decision(Decision.YES, NAME, "Decider is disabled.");
    }
    // ...
    Set<DiscoveryNode> eligibleNodes = new HashSet<>();
    collectEligibleNodes(allocation, eligibleNodes, role); // Considers all nodes of a certain role.
    // ...
}

private void collectEligibleNodes(RoutingAllocation allocation, Set<DiscoveryNode> eligibleNodes, DiscoveryNodeRole role) {
    for (DiscoveryNode discoveryNode : allocation.nodes()) {
        if (discoveryNode.getRoles().contains(role) && /* not shutting down */) {
            eligibleNodes.add(discoveryNode);
        }
    }
}

After:

// In IndexBalanceAllocationDecider.java

public Decision canAllocate(ShardRouting shardRouting, RoutingNode node, RoutingAllocation allocation) {
    // The check for hasFilters() is removed.
    // ...
    Set<DiscoveryNode> eligibleNodes = new HashSet<>();
    collectEligibleNodes(allocation, eligibleNodes, role); // Now respects filters.
    // ...
}

private void collectEligibleNodes(RoutingAllocation allocation, Set<DiscoveryNode> eligibleNodes, DiscoveryNodeRole role) {
    for (DiscoveryNode discoveryNode : allocation.nodes()) {
        if (discoveryNode.getRoles().contains(role) && /* not shutting down */
            && matchesFilters(discoveryNode)) { // Apply cluster-wide filters
            eligibleNodes.add(discoveryNode);
        }
    }
}

private boolean matchesFilters(DiscoveryNode node) {
    // Logic to check node against clusterRequireFilters,
    // clusterIncludeFilters, and clusterExcludeFilters.
    // Similar to FilterAllocationDecider logic.
}
Suggestion importance[1-10]: 9

__

Why: This suggestion correctly identifies a major design limitation that severely restricts the new decider's applicability in common real-world scenarios, such as tiered architectures, and proposes a crucial enhancement to make the feature broadly useful.

High
  • More

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants